ABSTRACT

All positive-stranded RNA viruses with genomes 
>~7 kb encode helicases, which generally are
poorly characterized. The core of the nidovirus 
superfamily 1 helicase (HEL1) is associated with a 
unique N-terminal zinc-binding domain (ZBD) that 
was previously implicated in helicase regulation, 
genome replication and subgenomic mRNA synthe- 
sis. The high-resolution structure of the arterivirus 
helicase (nsp10), alone and in complex with a poly- 
nucleotide substrate, now provides first insights 
into the structural basis for nidovirus helicase 
function. A previously uncharacterized domain 1B
connects HEL1 domains 1A and 2A to a long linker 
of ZBD, which further consists of a novel RING-like 
module and treble-clef zinc finger, together 
coordinating three Zn atoms. On substrate binding, 
major conformational changes were evident outside 
the HEL1 domains, notably in domain 1B. Structural 
characterization, mutagenesis and biochemistry 
revealed that helicase activity depends on the ex- 
tensive relay of interactions between the ZBD and 
HEL1 domains. The arterivirus helicase structurally 
resembles the cellular Upf1 helicase, suggesting
that nidoviruses may also use their helicases for
post-transcriptional quality control of their large 
RNA genomes. 

INTRODUCTION 

Helicases and nucleic acid translocases are ATP-depend- 
ent motor proteins capable of moving along their nucleic 
acid substrates while either unwinding duplexed regions 
(helicases) or performing other functions (translocases), 
including protein displacement and the nucleation of 
larger RNA-protein complexes (1,2). These enzymes are 
known to be critical players in a wide variety of biological 
processes and are encoded by all organisms, as well as 
positive-stranded RNA (+RNA) viruses with genomes 
larger than about 7 kb [(3); for reviews, see (4-6)]. On
the basis of sequence comparisons, helicases/translocases
have been classified into six superfamilies (SF1 to SF6)
(7,8), with +RNA viral helicases belonging to SF1, SF2
or SF3. Based on the direction of translocation, helicases
of various superfamilies have been divided into (biochemical)
classes A and B, which translocate along their nucleic
acid substrates in the 3'-5' or 5'-3' direction, respectively
(7). In the case of SF1 helicases (9,10), structurally
characterized cellular enzymes of class B (SF1B) are
further divided into the phylogenetically compact
Pif1-like (Pif1, RecD2), UvrD/Rep and Upf1-like (Upf1,
Ighmbp2) groups, with the latter being able to unwind
both DNA and RNA duplexes (11).

Helicase SF1 also includes a large number of (putative)
helicases from a dozen +RNA virus families belonging
to two diverse phylogenetic lineages, known as the
alphavirus-like (or Sindbis virus-like) supergroup (12)
and the order Nidovirales (13). More detailed studies on
the SF1 helicases of two alphavirus-like viruses have
recently been published. The helicase domain of the
Dendrolimus punctatus tetravirus (an insect virus from
the Alphatetraviridae family) was found to have dsRNA-
unwinding activity with 5'-3' directionality (14). The
helicase domain of the plant tomato mosaic virus
(ToMV; family Virgaviridae) was not characterized
enzymatically, but its crystal structure revealed the two
canonical RecA-like α/β domains (1A and 2A) of the
helicase core (15). Accessory domain insertions, an otherwise
frequently observed phenomenon among cellular SF1
helicases, are lacking in the ToMV helicase. The SF1
helicases of nidoviruses (HEL1), one of which is the
focus of this study, were characterized in some detail 
using bioinformatics, molecular genetics and biochemistry 
(see below), but structural information was lacking 
thus far. 

Nidoviruses constitute an order of +RNA viruses 
composed of virus groups targeting a wide variety of 
mammalian, avian and invertebrate hosts. In mammals, 
nidovirus infection can be associated with severe respira- 
tory disease, as in the case of porcine reproductive and 
respiratory syndrome (PRRS) (16), one of the leading 
swine diseases (caused by arteriviruses), and zoonotic coronavirus
infections in humans, like severe acute respiratory
syndrome (17) and Middle East respiratory syndrome
(18). The continuing outbreak of the latter disease is currently
attracting worldwide attention, in particular
because of its ~40% case fatality rate. Besides their pathogenic
properties, nidoviruses have been studied for their
extraordinarily large RNA genomes: even the shortest 
nidovirus genome (the 12.7-kb RNA of the arterivirus 
equine arteritis virus, EAV) outranks almost all other 
mammalian +RNA virus genomes, whereas coronavirus 
genomes (26.3-31.7 kb) are larger than those of any other 
RNA virus group. Their large genome size enabled 
nidoviruses to evolve substantial genetic complexity, 
which is evident from (among other properties) the acqui- 
sition of a variety of enzymatic activities and accessory 
proteins, many of which are lacking or rare in other 
+RNA viruses (19). These proteins appear to contribute 
to the regulation of the complex RNA synthesis of 
nidoviruses, which occurs exclusively in the cytoplasm of 
the infected cell, and to the elaborate array of virus-host 
interactions needed to support efficient virus replication
(13,20). For example, nidoviruses with genomes >20 kb
use a proofreading 3'-5' RNA exonuclease that is
proposed to promote the fidelity of viral RNA synthesis
(19,21-27). However, it is completely unknown whether
and how nidoviruses deal with translational quality
control during the expression of their large multicistronic
genomic RNAs, which also serve as mRNAs for the 
synthesis of the viral replicative enzymes. 



Compared with other +RNA viruses, nidovirus replic- 
ase genes encode an exceptionally large number of non- 
structural proteins (nsps) (19,24,25,28). Nidovirus nsps are 
expressed from open reading frames (ORFs) 1a and 1b,
which make up the 5'-proximal 65-75% of the genome
RNA. ORF1a encodes polyprotein 1a (pp1a; size
ranging from 1728 to 4550 aa) and, following a -1 ribosomal
frameshift, pp1a can be extended with the ORF1b-encoded
polyprotein to give pp1ab (3175-7183 aa) (29)
(Supplementary Figure S1). Both polyproteins are
subject to extensive proteolytic processing by multiple,
internally encoded proteinases (19,30). The nidovirus replicase
backbone consists of a conserved array of domains,
arranged in a nidovirus-specific order and including
the ORF1b-encoded RNA-dependent RNA polymerase
(RdRp) and HEL1 domains, the core enzymes needed
for genome RNA synthesis (replication) and subgenomic 
(sg) mRNA production (transcription). The latter process 
yields an extensive nested set of sg mRNAs, which are 
used to express up to a dozen structural and accessory 
proteins from smaller ORFs in the 3'-proximal part of 
the genome (31-33). In both corona- and arteriviruses, 
sg mRNAs contain a common leader sequence that is 
identical to the 5' end of the genome. Their generation 
from sg negative-stranded templates involves a mechanism 
of discontinuous negative strand RNA synthesis (31,32). 

Previous studies identified the nsp carrying RNA 
helicase activity (arterivirus nsp10 and coronavirus
nsp13) as one of the two most evolutionarily conserved
nidovirus proteins. Biochemical studies using recombinant 
arterivirus and coronavirus helicases revealed similar en- 
zymatic properties, including nucleic acid-stimulated 
ATPase and 5'-3' duplex unwinding activities on both 
RNA and DNA substrates containing 5' single-stranded 
regions (34,35). A unique nidovirus helicase feature is the 
presence of an N-terminal (predicted) complex zinc- 
binding domain (ZBD) of 80-100 residues. ZBD 
includes 12 or 13 conserved Cys/His residues (36) and is 
a nidoviral genetic marker not found in any other RNA 
virus group (19). ZBD is separated from the downstream 
HEL1 domain by an uncharacterized domain that varies
in size and sequence between arteri- and coronaviruses 
(37). For the arterivirus prototype EAV, the significance 
of the nsp10 ZBD was evaluated extensively using site-
directed mutagenesis in combination with biochemical
assays and reverse genetics. Amino acid substitutions in 
ZBD or the adjacent 'spacer' that connects it to the down- 
stream domain can profoundly affect EAV helicase 
activity and RNA synthesis, with most replacements of 
conserved Cys or His residues yielding replication- 
negative virus phenotypes (36,37). Intriguingly, some 
mutations in the spacer region selectively inactivated tran- 
scription, while not affecting replication (36,38), strongly 
suggesting a specific role for nsp10 in the unique mechanism
of discontinuous sg RNA synthesis.

Despite its importance as a key replicative enzyme and 
antiviral drug target (39), no 3D structural information 
has been reported for any nidovirus helicase. To under- 
stand the regulatory role of ZBD and the protein's inter- 
action with nucleic acids, we characterized the structure of 
a helicase-competent derivative of EAV nsp10, alone and
in complex with poly(dT). The multi-domain nsp10
includes the canonical 1A and 2A core domains of an
SF1 helicase, a flexible accessory domain that is sensitive
to nucleic acid binding, and a complex ZBD displaying a
novel structural organization. Strikingly, the protein was
found to bear structural resemblance to the eukaryotic
Upf1 helicases, which are multi-domain proteins
involved in RNA quality control, including nonsense-
mediated mRNA decay (40). Thus, our study not only
highlights how nidovirus helicase activity depends on the
extensive relay of interactions between the ZBD, accessory
and HEL1 domains but also provides a framework to
propose and explore a role for the enzyme in the post-
transcriptional quality control of nidovirus RNAs.

MATERIALS AND METHODS 

Cloning, expression and purification of soluble EAV nsp10

Nsp10 of the EAV-Bucyrus isolate (NCBI Reference
Sequence NC_002532) is composed of amino acids 2371-
2837 of replicase pp1ab, which will throughout this study
be referred to as nsp10 residues 1-467. The full-length
nsp10 sequence or a C-terminally truncated version
comprising residues 1-402 (nsp10Δ) were cloned into a
modified pET28a vector with a tobacco etch virus (TEV)
protease cleavage site. Mutations were generated using the
QuikChange protocol and confirmed by DNA sequencing.
The proteins were overexpressed at 37°C in Escherichia coli
strain BL21 (DE3) grown to an OD600 of ~0.8 in Luria-
Bertani medium in the presence of 50 µg/ml kanamycin.
Protein expression was induced with 0.2 mM isopropyl β-
D-1-thiogalactopyranoside for 12 h at 16°C. Cell pellets
were resuspended in lysis buffer (20 mM HEPES, pH 7.0,
for nsp10Δ or pH 8.0 for full-length nsp10, 500 mM NaCl
and 30 mM imidazole), supplemented with protease inhibitor
cocktail (Roche) and disrupted by sonication. Lysates
were clarified at 20 000 g for 30 min and the soluble fraction
was applied to a Ni2+ chelating column. After sample
loading, the column was washed (20 mM HEPES, pH 7.0
or 8.0, 500 mM NaCl and 60 mM imidazole) and the
protein was eluted (20 mM HEPES, pH 7.0 or 8.0,
500 mM NaCl and 400 mM imidazole). Proteins intended
for ATPase or helicase assays were dialysed against storage
buffer (20 mM HEPES, pH 7.0 or 8.0, 100 mM NaCl, 50%
glycerol) and stored at -20°C. Truncated protein for
crystallization studies was digested with 10% (w/w) TEV 
protease to remove the His-tag. Further purification was 
performed by size-exclusion chromatography using a 
Superdex 200 column (GE Healthcare) with GF buffer 
(20 mM HEPES, pH 7.0, 500 mM NaCl). The peak 
fraction was collected and analysed by sodium dodecyl 
sulphate-polyacrylamide gel electrophoresis. 
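
The residue numbering convention described above can be checked with a short script. This is only an illustrative sketch: the function and constant names are hypothetical and are not part of the study's methods. It converts pp1ab coordinates into nsp10 coordinates and derives the length of full-length nsp10 and the number of C-terminal residues missing from the truncated construct ending at residue 402.

```python
# Illustrative helper; all names here are hypothetical, not from the study.
PP1AB_START = 2371    # first pp1ab residue of nsp10 (stated in the text)
PP1AB_END = 2837      # last pp1ab residue of nsp10 (stated in the text)
TRUNCATION_END = 402  # last nsp10 residue kept in the truncated construct


def pp1ab_to_nsp10(position: int) -> int:
    """Convert a 1-based pp1ab residue number into nsp10 numbering."""
    if not PP1AB_START <= position <= PP1AB_END:
        raise ValueError(f"pp1ab residue {position} lies outside nsp10")
    return position - PP1AB_START + 1


# Length of full-length nsp10 and size of the C-terminal deletion.
nsp10_length = pp1ab_to_nsp10(PP1AB_END)
removed_c_terminal = nsp10_length - TRUNCATION_END

print(f"nsp10 spans residues 1-{nsp10_length}")
print(f"truncated construct lacks {removed_c_terminal} C-terminal residues")
```

The same offset can be applied in reverse (nsp10 position plus 2370) when relating a mutation such as K164Q back to polyprotein coordinates.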

Crystallization and data collection 

Purified nsp10Δ was concentrated to 10 mg/ml and initial
crystallization trials were performed at 16°C using the
sitting-drop vapour-diffusion method by mixing 1 µl of
protein solution with 1 µl of reservoir solution. The conditions
were then optimized and high-quality crystals were
obtained in 1.6 M (NH4)2SO4, 0.1 M HEPES, pH 7.1,
25 mM KCl and 20% ethylene glycol. To obtain crystals
of the protein-DNA complex, purified protein and partially
double-stranded DNA with a 5' single-stranded
poly-thymidine overhang (the two partially complementary
sequences were 5'-TTTTTTTTTTGCAGTGCTCG-3'
and 5'-CGCGAGCACTGC-3') were mixed in a
1:1.5 molar ratio and incubated at 4°C overnight. The
complex was further purified by size-exclusion chromatography
(Superdex 200, GE Healthcare) and concentrated to
5 mg/ml. The condition for obtaining crystals was 14%
PEG 3350, 0.1 M HEPES, pH 7.0, and 0.2 M calcium 
acetate. For data collection, crystals were cryoprotected 
in mother liquor containing 25% (v/v) ethylene glycol 
and flash cooled to -173°C. 

The multi-wavelength anomalous diffraction (MAD) 
data for intrinsic zinc atoms were collected on beamline
1W2B at the Beijing Synchrotron Radiation Facility. The
data for EAV nsp10Δ and its complex with DNA were
collected at beamline NE3A at Photon Factory (KEK)
and beamline BL17U1 at the Shanghai Synchrotron
Radiation Facility. Data were indexed, integrated and
scaled using HKL2000 (41). Data collection and process- 
ing statistics are summarized in Table 1. 

Structure determination 

The structure of nsp10Δ was determined by the MAD
method. Initial phases were calculated by SOLVE, and 
phases were subsequently improved using RESOLVE 
(42). The figure of merit from the MAD phasing 
was 0.36 and the Z score was 15.7. Several segments of 
the protein could be automatically modelled into the 
electron-density map by RESOLVE, although in part 
only as poly-alanine chains. Manual rebuilding was per- 
formed in COOT (43), and refinement was performed with 
REFMAC5 (44). Further rounds of refinement were done
with Translation/Libration/Screw (TLS) refinement (45).
The structure was refined to 2.0 Å with an Rwork of 19.5%
and an Rfree of 22.4%.
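
As a reminder for readers less familiar with crystallographic statistics, the R factors quoted here follow the standard crystallographic definition, which is reproduced below for convenience and is not taken from this paper. F_obs and F_calc denote the observed and model-calculated structure-factor amplitudes. Rwork is summed over the reflections used in refinement, whereas Rfree is computed with the same formula over a test set of reflections excluded from refinement (the size of that test set is not stated in this section).

```latex
R = \frac{\sum_{hkl} \bigl|\, |F_{\mathrm{obs}}(hkl)| - |F_{\mathrm{calc}}(hkl)| \,\bigr|}
         {\sum_{hkl} |F_{\mathrm{obs}}(hkl)|}
```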

Using the structure of free nsp10Δ without domain 1B
as input model, the structure of nsp10Δ in complex with
DNA was successfully solved by molecular replacement.
The initial model was obtained by MOLREP from the
CCP4 program suite (46). A good match for domains
ZBD, 1A and 2A with electron density was found.
Domain 1B was manually added with the aid of 2Fo-Fc
and Fo-Fc maps using COOT (43). DNA molecules were
included in the final stages of refinement. Difference
Fourier maps clearly showed electron densities for seven
bound deoxyribonucleotides. The final model was refined
to 2.65 Å with an Rwork of 23.2% and an Rfree of 25.7%.
All figures in this article displaying molecular structure 
were made using PYMOL (47). 

RESULTS 

Table 1. Data collection and refinement statistics of nsp10Δ and the nsp10Δ-DNA complex

C-terminally truncated EAV nsp10 retains ATPase and
helicase activity

Full-length EAV nsp10 and a series of truncated variants
were overexpressed in and purified from E. coli. After extensive
crystallization trials, diffracting crystals could only
be obtained for a truncated form of nsp10 (aa 1-402)
lacking the 65 C-terminal residues. For simplicity, we
will hereafter refer to this protein as nsp10Δ, which was
used throughout this study unless otherwise specified. To
verify that nsp10Δ, which contained all characteristic SF1
helicase sequences (motifs), is enzymatically active, we
performed in vitro enzyme assays to compare full-length
and truncated nsp10. In agreement with previously published
results (35), full-length nsp10 displayed only weak
ATPase activity in the absence of nucleic acid, but was
strongly stimulated by the addition of poly-uridine
(polyU). In the absence of polyU, nsp10Δ showed a
5-fold higher ATPase activity than the full-length
protein (Figure 1A), yet this increased ATP turnover
apparently did not translate into increased helicase activity.
Unwinding of a partially double-stranded DNA substrate
by nsp10Δ was incomplete, but went to completion when
using full-length nsp10 (Figure 1B). As expected, replacement
of the conserved lysine of the Walker A motif, which
is essential for ATP hydrolysis (35), with glutamine
(mutant K164Q) completely abolished ATPase and consequently
also helicase activity. This confirmed that the
observed activities could be completely attributed to the 
recombinant EAV proteins used, rather than to potential 
trace amounts of contaminating bacterial enzymes. 

The observed enzymatic differences between nsp10 and
nsp10Δ may be caused by the latter's truncation and
could, in principle, be explained by one or multiple
defects, like decreased unwinding velocity and/or
processivity, loss of affinity towards the substrate or
uncoupling of ATPase from helicase activity. The results
of the ATPase assay lead us to propose that the observed
reduction of duplex unwinding may be due to unproductive
ATP hydrolysis, originating from the fact that the
ATPase reaction is independent of nucleic acid substrate
binding. Accordingly, the input ATP in the nsp10Δ assay
may have been depleted before complete unwinding was
achieved. Regardless of which interpretation is correct, the
C-terminal 65 amino acids clearly are dispensable for the
helicase activity of EAV nsp10. This result is in good
agreement with the fact that the truncated protein
retained all HEL1 key domains (Figure 2A) previously
shown to be evolutionarily conserved and essential in
both in vitro enzyme assays and in vivo studies with virus 
mutants. 

The crystal structure of EAV nsp10Δ reveals a
multi-domain organization of the arterivirus 
replicative helicase 

Because 3D structures of orthologous proteins were not 
available, we took advantage of the zinc-binding 
properties of nsp10 and used the zinc multiple-wavelength
anomalous dispersion (MAD) method (42) to solve the
EAV nsp10Δ structure. The presence and position of
three zinc atoms were established with anomalous data
collected from the zinc absorption edge (Table 1). The
final model included EAV nsp10 residues 1-401, three
zinc ions in the N-terminal ZBD, five sulphate ions and 
267 water molecules. 

Figure 1. EAV nsp10 in vitro enzymatic activity assays. (A) ATPase activity of full-length EAV nsp10, the C-terminally truncated nsp10Δ (aa 1-402)
and respective active site mutants carrying a Lys-164 to Gln substitution in their Walker A box were analysed as described in Supplementary
Experimental Procedures. In the absence of nucleic acid, ATPase activity was measured by incubation at 20°C for 0, 5, 15 and 30 min, respectively.
ATPase activity was strongly stimulated by the presence of polyU, as measured by incubation at 20°C for 5 min. (B) Helicase activity of full-length
nsp10 and nsp10Δ, and their respective K164Q mutants. Activity was determined with the indicated DNA substrate (the asterisk marks the position
of the radioactive label). Samples were incubated for 0 or 30 min at 30°C. Control samples without protein or ATP were incubated for 30 min.




Figure 2. Overall structures of EAV nsp10Δ and the nsp10Δ-DNA binary complex. (A) Domain organization of EAV nsp10 depicting the
N-terminal ZBD (yellow), the two RecA-like domains 1A (green) and 2A (cyan) of HEL1 and an additional regulatory domain 1B (magenta).
Structure of (B) free and (C) nucleic acid-bound nsp10Δ. The Fo-Fc difference electron density map of the bound single-stranded part of a
partially double-stranded DNA substrate at 2.5 σ is also presented. The putative ATP binding site is shown as a brown oval.



Two RecA-like α/β domains (1A and 2A) form the
structure's C-terminal part (Figure 2B; green and cyan)
and constitute the helicase core (HEL1). Domain 1A
contains a parallel five-stranded β-sheet that is sandwiched
by three α-helices on one side and two α-helices on the
other. Domain 2A contains a parallel four-stranded
β-sheet with five α-helices on the side facing domain 1A.
Upstream of domain 1A, we identified an additional
domain with a characteristic β-barrel fold (Figure 2A
and B; magenta). It consists of five β-strands arranged
as two tightly packed anti-parallel β-sheets and is
juxtaposed to domain 1A (Figure 2B). The location of
this domain in the protein sequence and its orientation
relative to the HEL1 domain resemble those of domain
1B in helicases of the SF1B Upf1-like subfamily
(Figure 3B and C), and it was therefore named accordingly
in our nsp10Δ structure. The domain has no counterpart
in the only other solved structure of a viral SF1
helicase, that from ToMV (Figure 3A) (15), whereas its
counterpart in helicases of the Pif1-like subfamily is
inserted in domain 2A (Figure 3D) (48).

Our structure further revealed that the N-terminal ZBD
(Figure 2; yellow) has a compact fold containing three
structural zinc atoms. Based on secondary structure
analysis with DIAL (49), we could partition ZBD into
three elements (Figure 4). Two adjacent and structurally
different zinc fingers, an N-terminal RING-like module
(residues 1-40, pink) and a treble-clef zinc finger
(residues 41-65, red), constitute the main body of ZBD.
The third element is a C-terminal linker region (Linker 1)
that includes the long loop L7, which crosses the entire
domain, and helix α4 (residues 66-82, yellow), which
connects the two zinc fingers with domain 1B
(Figure 4A). This classification is further supported by
the observation that the connecting residues between the
RING module and treble-clef zinc finger are disordered
(Supplementary Figure S2 and Figure 4D). Only 12 out of
the 13 Cys/His residues are involved in zinc binding,
rather than all 13 residues as proposed previously [(36);
Figure 4B and C]. Not involved is His34, which is
not conserved in other arteri- and coronaviruses
(Supplementary Figure S3B).

Figure 3. Structural comparison of EAV nsp10Δ with selected SF1 helicases. (A) ToMV-HEL (pdb code: 3vkw), (B) EAV nsp10Δ, (C) hUpf1 (pdb
code: 2wjy) and (D) RecD2 (pdb code: 3gp8). Domain colours are the same as in Figure 2.

The N-terminal RING-like module has a notable
binuclear structure with a cross-brace topology involving
6 Cys and 2 His residues that coordinate two zinc ions
(Figure 4A). A three-stranded antiparallel β-sheet (β1-β3)
sits in the centre and packs against helix α1 following β2
(Figure 4B). The first zinc ion (Zn1) is coordinated by four
cysteine residues (Cys4, Cys7, Cys22 and Cys25) within a
treble-clef zinc finger-like motif. Residues Cys4 and Cys7
are provided by the zinc knuckle within loop L1, whereas
Cys22 is positioned at the C-terminus of β2 and Cys25
comes from the N-terminus of helix α1. The second zinc
ion (Zn2) is coordinated by residues Cys17, Cys33, His29
and His32, which are arranged in an αββ zinc finger-like
motif. The second pair of the zinc-coordinating residues of
both zinc-binding motifs of the RING module may include 
both His and Cys residues in other arteri- and 
coronaviruses. Overall, the RING module of these viruses 
can be described by a characteristic conserved Cys2A-CysB- 
Cys[His/Cys]A-[His/Cys]3B pattern (where applicable, 
A and B refer to residues chelating the first and second 
zinc ion, respectively; brackets indicate positions at which 
His and Cys can alternate). 

The C-terminal zinc finger of ZBD adopts a treble-clef 
fold distinct from that of the RING module (see above; 
Figure 4C). Two one-turn helices α2 and α3 are stabilized
by a zinc atom (Zn3) that is chelated by residues Cys42
and His44 of a Zn-knuckle within loop L5, while Cys53
and Cys56 originate from L6 and α3, respectively. An
extensive array of hydrogen bonds is observed between
the main chains of residues in loop L7 and Thr54 in α3
(Figure 4D). These multiple hydrogen-bonding interactions
play a major role in the formation of a compact
zinc finger. Arteri- and coronaviruses appear to tolerate
replacements (Cys for His, or vice versa) at the second and
fourth residues of this finger (36,37), which can be
described by the characteristic, conserved C[H/C]C[C/H]
pattern. Finally, Linker 1 includes only one structured
element (α4), but it plays a central role in the interaction
between the main body of ZBD and HEL1, as detailed
below. 
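
The zinc coordination described in this section can be tabulated and checked programmatically. The snippet below is a minimal sketch: the ligand assignments are copied from the text, while the data structure, the helper names and the translation of the C[H/C]C[C/H] pattern into a regular expression are illustrative assumptions. It confirms that the three sites together use 12 ligands, that His34 is not among them, and that the treble-clef ligands of Zn3, read in sequence order, fit the conserved pattern.

```python
import re

# Zinc ligands as listed in the text; the dictionary layout is illustrative.
ZINC_SITES = {
    "Zn1": ["Cys4", "Cys7", "Cys22", "Cys25"],    # RING-like module
    "Zn2": ["Cys17", "His29", "His32", "Cys33"],  # RING-like module
    "Zn3": ["Cys42", "His44", "Cys53", "Cys56"],  # treble-clef zinc finger
}


def position(residue: str) -> int:
    """Return the residue number from a label such as 'Cys42'."""
    return int(residue[3:])


def one_letter_pattern(residues: list[str]) -> str:
    """Spell the ligands in sequence order using C (Cys) and H (His)."""
    ordered = sorted(residues, key=position)
    return "".join("C" if r.startswith("Cys") else "H" for r in ordered)


all_ligands = [r for site in ZINC_SITES.values() for r in site]
treble_clef = one_letter_pattern(ZINC_SITES["Zn3"])

# Hypothetical regex rendering of the C[H/C]C[C/H] pattern from the text.
matches_pattern = re.fullmatch(r"C[HC]C[CH]", treble_clef) is not None

print(f"total zinc ligands: {len(all_ligands)}")
print(f"His34 is a ligand: {'His34' in all_ligands}")
print(f"treble-clef ligands in sequence order: {treble_clef}")
print(f"fits C[H/C]C[C/H]: {matches_pattern}")
```

The same helper could be applied to aligned sequences of other arteri- and coronaviruses to see which positions alternate between Cys and His, provided the alignment columns are mapped to EAV numbering first.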

The structural basis for the essential role of ZBD in EAV
nsp10 helicase function

Previously, ZBD mutagenesis demonstrated the in vitro
and in vivo importance of this domain for nsp10 enzyme
activities, genome replication and transcription, and
arterivirus viability. The solved structure now provides
us with a structural basis for these observations. ZBD
packs against the HEL1 domains through extensive
hydrophobic and hydrophilic interactions (Figure 5A
and B). Specifically, residues Leu138, Val141, Val143,
Leu147, Pro247, Val248, Leu280 and Trp281 in domain
1A together with residues Ile71, Leu72, Leu75, Leu76 and
Ile79 from α4 in ZBD create an extensive hydrophobic
surface. The total interface area between ZBD and
HEL1 is 1019 Å², as determined by the Protein Interfaces,
Surfaces and Assemblies (PISA) server (50). A major
part of this interface involves the α4 helix, which is
located in a groove formed by two helices and a loop of
domain 1A, while making extensive contacts to the main
body of ZBD and, to a lesser extent, domain 1B (Figure 5).
The interface areas between α4 and domain 1A, on the one
hand, and the ZBD fingers (including zinc ions) on the
other hand, are 558.1 and 402.4 Å², respectively. In
addition, four hydrogen bonds between ZBD and
HEL1 enhance the interaction (Figure 5B), and a salt
bridge is observed between His78 in ZBD and Asp136 in
domain 1B (Figure 5B). The large size of these interface
surfaces and the large number of interactions suggest the
existence of a signalling network through which ZBD
could affect both the fold and activity of HEL1.

The proposed signalling network can now be used to 
rationalize, in a structural context, the previously reported 
phenotypes of EAV ZBD mutants carrying replacements 
of residues not directly involved in Zn-binding. For 
instance, a replication-negative phenotype was described 
for mutant D45A (36). It is now clear that Asp45 forms
two hydrogen bonds with the main and side chain of
Thr35 and electrostatically interacts with the side chain
of His34, which both belong to the RING-like zinc
finger (Figure 4D). Replacement of Asp45 may thus
greatly reduce these interactions and disrupt ZBD integrity,
potentially affecting the structural integrity of
HEL1. Another residue, Ser59, was probed extensively
by mutagenesis after the finding that a virus mutant
(EAV030F) carrying a S59P mutation replicates its
genomic RNA with wild-type efficiency, while being completely
defective in sg mRNA synthesis (38). This transcription-negative
phenotype was attributed to the severe
structural constraints exerted by Pro residues on the local
conformation of the proposed hinge region, as various
substitutions of Ser59 alone (to Ala, Cys, Gly, His, Leu
or Thr) yielded virus mutants with a wild-type phenotype,
while combining the neutral S59G mutation with a P60G
substitution reproduced the specific defect in sg mRNA
synthesis (36). This interpretation is now further supported
by the nsp10Δ structure in which Ser59 and
Pro60 are located in the hinge connecting the treble-clef
zinc finger and α4 of ZBD. The main chain of Ser59 forms
three hydrogen bonds with the treble-clef Thr54, which is
also connected to the Pro60 side chain and Lys61 main
chain (Figure 4D). Owing to the unique properties of the
Pro residue, the Ser59-to-Thr54 bonds are likely disrupted
by the S59P mutation, but are not affected by the alternative
replacements tested. Consequently, also owing to
the main chain rigidity associated with the introduction
of a Pro residue, the orientation of α4 relative to 1A
and/or the main body of ZBD is likely affected in
mutant S59P, which carries adjacent Pro residues at positions
59 and 60. Likewise, the introduction of two Gly
residues at these positions [double mutant S59G/P60G;
(36)] probably gives rise to excessive flexibility of the
hinge region, thus compromising nsp10 function in a
similar manner.

Figure 4. Structural characterization of the EAV nsp10Δ ZBD. (A) Topology of the ZBD with its RING-like module (pink), treble-clef zinc finger
(red) and Linker 1 (yellow) indicated. (B) Structure of the RING-like module and (C) treble-clef zinc finger. The residues coordinating the Zn2+ ions
are shown as sticks. (D) Interactions between the RING-like module and the treble-clef zinc finger. (E) Superposition of the RING-like modules of
EAV nsp10 (pink) and hUpf1 (pdb code: 2wjy; grey). (F) Sequence alignment of ZBD with the CH domain of hUpf1.

Figure 5. Inter-domain interactions of ZBD and HEL1 domains 1A and 1B. (A) Overview of the spatial orientation of the essential interaction helix
α4 of ZBD. (B) Close-up view of the domain interface. Residues engaged in interactions are shown as sticks. Domain colours are the same as in
Figure 2A. (C) DNA-binding assay of EAV nsp10Δ mutants with reduced Zn2+-binding capabilities. Positions of free DNA and protein-DNA
complexes are indicated by blue and red asterisks, respectively.

To further explore the role of ZBD, we tested the effect
of four mutations (C25A and H29A in the RING-like
module; H44A and C53A in the treble-clef zinc finger)
expected to affect the ability to bind Zn1, Zn2 or Zn3,
respectively. In agreement with the proposed structural
role of these zinc ions, soluble His-tagged proteins containing
these mutations could not be obtained and only
low yields of GST-nsp10 fusion proteins carrying the same
mutations could be recovered. For mutants C25A and
H29A, band shift analysis revealed a complete loss of
binding to a partially double-stranded DNA substrate
containing a 5' single-stranded poly(dT) overhang (substrate
5'DNA-T10; Figure 5C, lanes 3-4). These results
complement previous findings, showing a complete loss
of both ATPase and helicase activity for these mutants
(37). In contrast, the level of nucleic acid binding by
mutants H44A and C53A was comparable with that of
the wild-type protein (Figure 5C, lanes 5-6), consistent
with nsp10-H44A retaining a limited level of ATPase
and helicase activity (37). On further testing, we
observed that the addition of 40 mM EDTA altered the
overall conformation of nsp10Δ, as detected by changes in
circular dichroism (Supplementary Figure S4A), and
reduced its binding to 5'-DNA-A10 (Supplementary
Figure S4B). In summary, these results reveal that ZBD
interacts extensively with the HEL1 domain and that its
integrity is an essential determinant of nsp10Δ properties
in in vitro assays.

Structural resemblance between EAV nsp10Δ and mRNA
decay factor Upf1

Next we analysed the existence of structural similarity
between EAV nsp10Δ and other proteins by scanning a
protein data bank using the DALI server (51). The structure
of the nsp10Δ HEL1 domain was found to be most
similar [Z score, 20.9; root-mean-square deviation
(RMSD), 3.5 Å] to the helicase core of nonsense-
mediated mRNA decay factor Upf1 and its homolog
Ighmbp2 (Z score, 19.9; RMSD, 3.0 Å), which both
belong to the Upf1-like helicase subfamily (11). Further
comparisons revealed that this resemblance extends into
the respective N-terminal ZBDs: the binuclear RING-like
module of nsp10Δ ZBD was found to be most similar to
RING-like module 1 in the CH-domain of Upf1 (Figure
4E). This similarity was rather limited (Z-score of 1.9 and
RMSD of 2.2 Å) because only six out of the eight Zn-
chelating residues in the two domains could be juxtaposed
(Figure 4F) and because loops L1, L3 and helix α1 in
nsp10Δ are shorter than the corresponding elements in
Upf1. We did not detect significant similarity of the
treble-clef zinc finger with other proteins, although we
note that the Upf1 CH-domain also has a zinc finger
(but of a different fold) downstream of the RING1
module. Thus, EAV nsp10 ZBD prototypes a novel and
complex multi-domain zinc finger with distinct structural
properties. On the other hand, EAV nsp10 and Upf1 share
a similar domain organization, including structurally
similar RING and helicase domains. These similarities
are further enhanced by the 5'-3' directionality of duplex
unwinding shared by both these helicases and likely
extend to other nidovirus helicases in view of
the observed sequence conservation (Supplementary 
Figure S3B). 

Structure of EAV nsp10Δ in complex with a nucleic 
acid substrate 

We proceeded to solve the crystal structure of nsp10Δ in 
complex with a nucleic acid substrate. Nidovirus RNA 
helicases, including EAV nsp10, were previously found 
to lack the ability to discriminate between RNA and 
DNA substrates, a property shared with only a few 
other helicases (34,35). This substrate promiscuity 
allowed us to use a partially double-stranded DNA 
substrate (5'DNA-T10) containing a 5' single-stranded 
poly(dT) overhang for crystallographic studies. The 
binding of this substrate was deduced from an increase 
of the protein's Stokes radius in gel filtration 
chromatography (Supplementary Figure S5). The binary complex 
diffracted to a resolution of 2.65 Å in space group P1 
and was solved by molecular replacement (Table 1). 
Continuous electron density was found in the enzyme's 
binding pocket (Figure 2C), which apparently corresponded 
to seven thymidine residues. This part adopts 
an extended conformation and lies in a channel 
formed by domains 1A, 1B and 2A, with its 5' end in 
domain 2A and its 3' end in domain 1A. The remaining 
three unpaired thymidines and the entire double-stranded 
portion of the substrate could not be located. The 
asymmetric unit contained four nsp10Δ-DNA binary 
complexes with a Matthews coefficient of 2.73 Å³/Da, 
corresponding to a solvent content of 55%. These complexes 
shared a remarkably similar spatial arrangement with the 
RMSD of their Cα atoms being only 0.8 Å. Several 
connecting residues between subdomains were missing in the 
structure of the complex, indicating apparent structural 
flexibility of these residues. 
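
The relation between the quoted Matthews coefficient and the solvent content can be checked with a short calculation. The sketch below assumes the conventional approximation in which protein occupies about 1.23 Å³ per dalton (a partial specific volume of roughly 0.74 cm³/g); this constant is an assumption, not a value stated in the text, and it ignores the contribution of the bound DNA. The function name is hypothetical.

```python
# Estimate crystal solvent content from the Matthews coefficient V_M.
# Assumption: the crystallized material occupies ~1.23 Å^3 per dalton,
# the conventional protein-only approximation (bound DNA is ignored).

PROTEIN_VOLUME_PER_DALTON = 1.23  # Å^3/Da, assumed constant


def solvent_fraction(matthews_coefficient: float) -> float:
    """Return the estimated solvent fraction (0..1) for V_M in Å^3/Da."""
    if matthews_coefficient <= 0:
        raise ValueError("Matthews coefficient must be positive")
    return 1.0 - PROTEIN_VOLUME_PER_DALTON / matthews_coefficient


if __name__ == "__main__":
    vm = 2.73  # Å^3/Da, value reported for the nsp10Δ-DNA crystals
    print(f"Estimated solvent content: {solvent_fraction(vm):.1%}")
```

Under this approximation, a Matthews coefficient of 2.73 Å³/Da gives a value that rounds to the reported 55%.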

Nucleic acid binding induces profound conformational 
changes outside the HEL1 domain of nsp10Δ 

The Cα atoms of domains 1A and 2A of free nsp10Δ and 
the nsp10Δ-DNA complex can be superimposed with an 
RMSD of 0.6 Å, indicating that the relative orientations of 
these core domains are barely affected by DNA binding 
(Figure 6A). However, outside these domains, the effect of 
DNA binding was considerable, with the RMSD between 
the Cα atoms of the two forms of nsp10Δ increasing to 
1.8 Å. Particularly large conformational changes were 
observed in domain 1B, which rotates ~28° towards 
ZBD in the nsp10Δ-DNA complex (Figure 6A). The 
RMSD between the Cα atoms of the two forms of 
domain 1B is 1.8 Å, with loop residues being affected 
most profoundly (Supplementary Figure S6A). Both 
width and height of the polynucleotide substrate channel 
formed by domains 1A and 1B (originally ~5 and 11 Å, 
respectively) are increased by 2 Å on this rotation. This 
reorganization makes this channel large enough to 
accept single-stranded nucleic acids, although it remains 
too narrow for a nucleic acid duplex (Figure 6B). 
Consequently, double-stranded nucleic acids must be 
unwound at the entrance of the substrate channel to let 
a single-stranded chain enter. Besides this large 
conformational change, temperature factor calculations 
suggest that the regions at the surface of domain 1B not 
directly involved in DNA binding may become flexible 
(Supplementary Figure S2). For example, domain 1B 
residues Arg95, Gly125 and Ala131 become disordered 
after DNA binding (Figures 2C and 3B and 
Supplementary Figure S2). 
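
For readers unfamiliar with the RMSD values used above to compare the free and DNA-bound forms, the following minimal sketch shows how such a value is computed from paired Cα coordinates. It assumes the two structures have already been superimposed and that the atoms are matched residue by residue; the superposition step itself is not shown, and the source does not state which software was used. The function name and the toy coordinates are hypothetical and are not taken from the nsp10Δ structures.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (Å) between two paired coordinate lists.

    Assumes both lists are already superimposed and that element i of
    coords_a corresponds to element i of coords_b.
    """
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


if __name__ == "__main__":
    # Made-up Cα positions for three residues in two states (illustrative only).
    free_state = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
    bound_state = [(0.0, 0.0, 0.0), (3.8, 1.0, 0.0), (7.6, 0.0, 2.0)]
    print(f"RMSD = {rmsd(free_state, bound_state):.2f} Å")
```

Because the deviations are squared before averaging, a few strongly displaced residues (such as the loops of domain 1B) can dominate the value even when most atoms barely move.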

On DNA binding, a structural change was also 
observed in the treble-clef zinc finger of ZBD, as reflected 
by its relatively high temperature factor (compared with 
that of domains 1A and 2A) in the nsp10Δ-DNA complex 
as opposed to nsp10Δ alone (Supplementary Figure S2). 

Substrate recognition by EAV nsp10Δ is 
sequence-independent 

As outlined above, the single-stranded part of the DNA 
substrate is bound to a nucleic acid-binding channel 
formed by domains 1A, 1B and 2A (Figure 2C). The 
backbone phosphates of the poly(dT) are located on top 
of domains 1A and 2A, with the thymine bases exposed to 
the solvent (Supplementary Figure S7A). The majority of 
contacts with the bound DNA are made via the 
phosphodiester backbone and non-specific protein-base 
interactions as depicted in Figure 7. Consistent with this 
observation, the base orientation varies in the four EAV 
nsp10-poly(dT) complexes of the asymmetric unit, while 
the position of the DNA backbone is rather rigid 
(Supplementary Figure S7B and C). Several key residues 
from domains 1A and 2A contact the DNA backbone in 
the channel of the protein (Figure 7A and B). Base T1, 
the most 5' one, is exposed to the solvent and protrudes 
outwards, causing a bend in the DNA backbone between 
T1 and T2. The bases T2 and T3 as well as T5 and T6 
stack with each other at an average distance of 3.7 Å. In 
contrast, base T4 is almost perpendicular to T3, with its 
edge exposed to protein side chains that make specific 
contacts. Val271 in domain 1A forms van der Waals 
contacts with the base and the sugar ring of T4 and thus 
stabilizes the DNA conformation. Moreover, the binding 
is stabilized by several hydrogen bonds between His186, 
His339, Thr348, Ser351 and the backbone of the DNA, 
and by van der Waals contacts between Thr185, Leu227, 
Val230, Tyr338 and the phosphate groups of the DNA. 
While the interactions described above do not involve 
specific bases, six further interactions specific for 
thymine were found. For example, the backbone NH of 
Arg102 forms a hydrogen bond with the O4 atom of T6. 
The O2 and O4 atoms of base T3 form hydrogen bonds 
with the side chains of Asp350 and Tyr51. Also, several 
residues, such as Arg374 and Gln229, interact with both 
the base and the sugar ring. However, no interaction 
was observed between nsp10Δ and position C2' of the 
ribose ring of the DNA substrate. This observation may 
explain why EAV nsp10 has the ability to unwind both 
DNA and RNA, in agreement with the substrate specificity 
observed for other helicases (52,53) possessing or 
lacking the ability to interact with the 2' OH moiety of 
the RNA backbone. 

Figure 6. Conformational changes of EAV nsp10Δ on nucleic acid binding. (A) Comparison of the free and DNA-bound states of nsp10Δ. The arrow indicates the movement of domain 1B (cartoon) in the DNA-bound state compared with the free state. (B) Surface model of the channel formed by domains 1A and 1B in the DNA-bound state and (C) the DNA-free state. Domain colours are the same as used in Figure 2A. Note that the DNA in Figure 5C was extracted from the complex structure of the DNA-bound state. 

Figure 7. Interactions between EAV nsp10Δ and a DNA substrate. (A) Stereo view of the nucleic acid-binding pocket of nsp10Δ. Bound single-stranded DNA and the interacting residues are shown as sticks. Nucleotides are numbered (T1 to T7) in the 5'-3' direction and are shown in orange. Residues are coloured according to their domain origin as indicated in Figure 2A. (B) Schematic representation of the contacts between nsp10Δ residues and DNA. 



DISCUSSION 

Among +RNA viruses, whose RdRps generally have a 
high error rate, nidoviruses stand out for their large to 
very large genome size (13-32kb). Consequently, the 
replication fidelity of nidoviruses, in particular coronaviruses, 
has been the subject of intense study. Most recently, the 
identification of a unique 3'-to-5' exoribonuclease (ExoN) 
activity has provided the basis for the hypothesis that a 
primitive proofreading mechanism operates to promote 
the fidelity of RNA-dependent RNA synthesis in 
nidoviruses with >20kb genomes (21-27). 



Despite this recent progress, the two central subunits of 
the nidovirus replicase, the RdRp and the unique 
ZBD-containing RNA helicase, have remained poorly 
characterized, in part due to the lack of structural 
information. Remarkably, our present analysis of the arterivirus 
helicase structure revealed a number of important 
similarities with Upf1 helicases, eukaryotic enzymes 
involved in quality control of RNAs through multiple 
pathways, including nonsense-mediated mRNA decay 
(54-56). In contrast to the ExoN-driven control of 
replication fidelity (see above), the possibility of 
post-transcriptional quality control of nidovirus mRNAs has not been 
considered thus far. Yet, replicase ORF1ab is extremely 
large (from 3175 to >7000 codons) and its correct 
expression by translation of the viral genome is a critical first 
step in the production of the enzymes directing genome 
replication and expression. Therefore, our study not only 
provides the first insights into the structural basis for 
nidovirus RNA helicase function, but also creates a 
basis to propose a role for this protein in the 
post-transcriptional quality control of viral mRNAs. This 
role may be common to all nidoviruses, regardless of 
their genome size, which would distinguish it from the 
ExoN-based proofreading mechanism that appears to be 
restricted to nidoviruses with a >20kb genome. On the 
time scale of nidovirus evolution, the acquisition of 
ZBD-HEL1 may have been a critical event to facilitate the 
genome expansion of ancestral small-sized nidoviruses, 
thus setting the stage for the subsequent ExoN-driven 
expansion towards even larger nidovirus genomes (19,57). 

EAV nsp10 represents a multi-domain helicase conserved 
in nidoviruses 

Previously, using bioinformatics, biochemistry and 
molecular genetics, it was established that nsp10 of 
arteriviruses and its orthologs in other nidoviruses are 
multi-domain proteins. Of its domains, ZBD and the 
HEL1 domains are critical for the enzyme's ATPase and 
helicase activities in vitro and for the regulation of viral 
replication and transcription in infected cells. Our 
structural and biochemical studies extended the 
characterization of known domains and delineated two hitherto 
uncharacterized domains: one (domain 1B) flanked by 
ZBD and HEL1, and the other (C-terminal domain) 
located downstream of HEL1, with its structure 
remaining to be solved. Our data show that, along with 
ZBD, these two non-enzymatic domains may regulate 
HEL1 function. Given that nsp10/nsp13 is one of only 
three proteins whose nidovirus-wide conservation can be 
detected at the sequence level (19,24,25,28), the nsp10Δ 
structure should be applicable to other nidovirus helicases, 
including those of PRRS viruses and coronaviruses. 
However, considerable size differences exist between 
arteri- and coronaviruses in the most conserved ZBD 
and HEL1 domains, whereas the 1B and C-terminal 
domains lack appreciable sequence conservation. Thus, 
helicase structures from other small- and large-genome 
nidoviruses will be required to fully understand the 
enzyme's function. 

The nsp10 C-terminal domain: coupling ATPase and 
helicase activities? 

While attempting to solve the EAV nsp10 structure, we 
were confronted with the low stability of the full-length 
recombinant protein expressed in E. coli. We solved this 
problem by characterizing the C-terminally truncated 
nsp10Δ, which lacks the 65 residues (C-terminal 
domain) downstream of the known HEL1 motifs. This 
protein was found to bind partially double-stranded 
DNA and display the previously reported in vitro 
ATPase and helicase activities. Because, compared with 
full-length nsp10, nsp10Δ appeared to be somewhat 
more active as an ATPase but somewhat less active as a 
helicase, the C-terminal truncation may have affected the 
coupling of these two enzymatic activities. This suggests 
that the C-terminal domain may have evolved to 
(co)regulate nsp10 helicase-mediated functions in vivo, implying 
that it must be able to communicate with the nsp10 
active site. This could be achieved either directly, by 
interacting with the nucleic acid- or ATP-binding site (the 
nsp10Δ C-terminus is ~22.5 Å away from the active centre; 
Figure 2C), or indirectly, through a protein signal 
transduction network. Importantly, the C-terminal domain is 
poorly conserved among arteri- and coronaviruses in 
terms of both sequence and size (Supplementary 
Figure S3A, and data not shown), arguing that such a 
putative regulatory function could be executed in a 
virus- and, possibly, host-specific manner. 

The nsp10 structure: defining a complex ZBD 

Our characterization of the EAV nsp10 structure verified 
and revised a model of the N-terminal ZBD based on 
prior studies (36,37,58). It resolved the uncertainty about 
the number of zinc ions bound (now established to be 
three) and the fold of this domain (a unique structure 
combining a RING-like module fused with a treble-clef 
zinc finger). Furthermore, it redefined the C-terminal 
border of ZBD and placed it 13 residues downstream to 
include a third hitherto unrecognized structural element 
(helix α4). Previously, we analysed a variety of EAV 
nsp10 ZBD mutants in which putative zinc-binding 
residues were replaced in a manner (Cys→His or 
His→Cys) that could preserve zinc binding (36,37). 
From the solved structure, it is now apparent that the 
replication-negative phenotypes of these virus mutants 
can likely be attributed to the detrimental impact of the 
respective mutations on ZBD integrity and, through the 
extensive interaction network, HEL1 domains. It 
presently remains unclear why the replacement of His44 by 
Cys in the treble-clef zinc finger was partially tolerated. 
On the other hand, structural superposition of the 
RING-like modules of nsp10 and hUpf1 (Figure 4E) reveals how 
the only other similarly tolerated replacement (36,37), that 
of the Zn1-coordinating Cys25 by His (found in the 
equivalent position in hUpf1), could be accommodated 
by nsp10. The RING-like module 1 of Upf1 also shares 
structural similarity with RING-box domains of E3 
ubiquitin ligases (59) and the involvement of this module in 
self-ubiquitination of Upf1 was indeed demonstrated (60). 
It would be interesting to see whether these results are 
relevant for nsp10 and its ZBD. Recently, arterivirus 
papain-like protease 2 was found to have deubiquitinase 
activity, which suppresses the innate immune response in 
infected host cells (61,62). 

The nsp10-nucleic acid complex: towards the dsRNA 
unwinding mechanism 

To understand how nsp10 unwinds its natural dsRNA 
substrates, we analysed a complex of nsp10Δ with a 
partially double-stranded DNA substrate. Only seven 
thymidine residues could be confirmed in the structure of that 
complex (Figure 2C). The DNA-bound nsp10Δ structure 
revealed two possible RNA-binding clefts at the surface of 
nsp10, which are formed by domains 1B and ZBD (named 
putative exit site 1), and 1A and ZBD (putative exit site 2), 
respectively (Supplementary Figure S8). Both have 
continuous positively charged surfaces, with the latter 
(Supplementary Figure S8, right panel) being sufficiently 
large to bind a ssRNA >10bp, which could be especially 
suited for unwinding complex secondary structures. This 
organization suggests that, after unwinding, one of the 
separated RNA chains would be guided through the 
narrow nucleic acid substrate tunnel formed by domains 
1A, 1B and 2A, while the path of the other strand remains 
to be defined. No matter which cleft is actually used for 
RNA binding, the positively charged ZBD, and especially 
its RING-like module, would be involved. 

Figure 8. Putative protein interaction surfaces of the EAV nsp10Δ ZBD. (A) Overview of surface charges of nsp10Δ. The putative protein interaction surfaces are indicated by a yellow circle. (B) Close-up view of zone 1 (orange) and zone 2 (magenta). Hydrophobic residues are shown as sticks. (C) Surface representation of the two putative interaction zones. The orientation is the same as in panel B. 

Like the protein-binding surface of the Upf1 C/H 
domain (54), ZBD has a putative protein interaction 
surface composed of two major hydrophobic zones 
that are almost perpendicular to each other (Figure 8). 
Nucleic acid binding induced a conformational change 
(Supplementary Figure S6B) of these two zones. In 
addition, the temperature factor of the treble-clef zinc 
finger was higher and several residues are disordered in the 
structure of the nsp10Δ-DNA complex (Supplementary 
Figure S2). Together, these findings imply that these two 
zones are readily accessible for interactions with other 
proteins, which may further influence nucleic acid binding. 

Substrate binding by nsp10 is accompanied by structural 
changes in domain 1B and the treble-clef zinc 
finger, which may be recognized by yet-to-be identified 
interaction partners modulating nsp10 function. The 
treble-clef finger is fairly distant from the bound substrate, 
suggesting long-distance signal transduction within nsp10, 
possibly involving helix α4, which interacts with 1A, 1B 
and nucleic acid and is directly connected to the treble-clef 
zinc finger. The flexibility of the hinge region connecting 
the treble-clef zinc finger and helix α4 is likely 
compromised by the previously described S59P and 
S59G/P60G mutations that, importantly, were found to 
impair viral sg mRNA synthesis but not genome 
replication (36). Consequently, the described inter-domain 
communication channel may be used by nsp10 and its partners 
for switching from a role in genome replication to 
directing viral transcription, a hypothesis that will be the subject 
of future studies. 

Nidovirus helicase: a role in post-transcriptional quality 
control of viral mRNAs? 

The observed structural affinity between the EAV nsp10 
and Upf1 helicases is most remarkable, in particular 
because it extends to include the multi-domain 
organization essential for helicase function. This 
organization is only found in Upf1 of all eukaryotes (59) 
and nidovirus helicases (19,24,25,28). For Upf1, its 
conservation was linked to the protein's universal role in 
post-transcriptional quality control of eukaryotic RNAs 
through multiple pathways, including nonsense-mediated 
mRNA decay (54-56). Upf1 interacts, commonly through 
its C/H and 1A domains, with proteins that can modulate 
its function. For the nidovirus helicase subunit, the 
functional basis of its domain conservation remains to be 
firmly established, although ZBD, like C/H in Upf1 
(63), affects helicase activity (36,37). 

If the nidovirus helicase possesses some of the 
properties of Upf1, this could explain the exclusive 
conservation of ZBD in nidoviruses, which stand out for 
their large to very large single-stranded RNA genomes. 
For instance, by providing post-transcriptional quality 
control of genomic RNA, i.e. detection of nonsense and/or 
other mutations and elimination of defective molecules, 
the nidovirus helicase could alleviate the consequences of 
the generally low fidelity of RNA virus genome 
replication. Such a role of ZBD-HEL1 may have protected an 
ancestral nidovirus from the mutational meltdown of its 
expanding genome, similar to the proposed fixation of the 
proofreading ExoN domain at a later stage of nidovirus 
evolution (19,24,25,28). Subsequently, the enzyme would 
have facilitated expansion to the genome size observed in 
contemporary arteriviruses, and remained a critical factor 
in the further ExoN-driven genome expansion to 
evolve middle- and large-sized nidoviruses. Thus, the 
proposed Upf1-like role of the nidovirus helicase can be 
accommodated in a meaningful evolutionary scenario 
incorporating several of the structural and functional 
observations made in this study. The structural similarity 
between nsp10 and Upf1 establishes a new connection 
between research on viral and cellular helicases, which 
could be mutually insightful for understanding the 
evolution and function of this group of vitally important 
enzymes. 